We analyze the variability in the X-ray lightcurves of the
black hole candidate Cygnus X-1 by linear and nonlinear time 
series analysis methods. While a linear model describes the 
over-all second order properties of the observed data well, 
surrogate data analysis reveals a significant deviation from 
linearity. We discuss the relation between shot noise models 
usually applied to analyze these data and linear stochastic 
autoregressive models. We debate statistical and interpreta- 
tional issues of surrogate data testing for the present context. 
Finally, we suggest a combination of tools from linear and 
nonlinear time series analysis methods as a procedure to test 
the predictions of astrophysical models on observed data. 

PACS: 05.40.-fj, 02.50.Wp, 97.80.Jp 

I. INTRODUCTION 

Cygnus X-1 is one of the best established black hole 
candidates. Mass accretion from its primary HDE 226868 
leads to X-ray emission which exhibits a variability on 
time scales of tenths of seconds |^ up to months |^ . The 
short-time variability is assumed to be caused by instabilities
of the accretion disk and is usually formally described
by shot noise models ^-||] which are a specific kind of 
point processes. These models are inspired by hypothe- 
ses about the physics of the accretion process and the 
processing of X-rays by Comptonization in the neighbor- 
hood of the black hole. Free parameters of these models, 
like morphology and distribution of the shots, are usually 
tuned to fit the observed energy or power spectra. 

On the other hand, starting from the observed data 
and characterizing the dynamical structure of this ob- 
served variability by time series analysis methods might 
yield valuable constraints on astrophysical models. This 
characterization can be, for example, a fit of an explicit 
model to the data or the extraction of a feature which 
captures some typical structure of the dynamics. Such 
a characterization could either inspire new astrophysical
models or could be used for additional tests of the predictions
of existing models. Of course, there is no direct
way from a characterization of observed data, whether by
modeling or by feature extraction, to an astrophysical
model: On the one hand, although the goodness-of-fit of
a diagnostic model can be evaluated by statistical tests, 
these tests might have low diagnostic power to detect a 
misspecification of the model. On the other hand, a cer- 
tain feature discovered in the data might be generated 
by many different types of dynamics. Therefore, before 
drawing conclusions about the underlying process from 
data analysis, different independent approaches should 
be used and the plausibility of a fitted model or an ex- 
tracted feature should be judged in the light of astro- 
physical knowledge. 

The first step of nonlinear time series analysis is usu- 
ally to study the structure of a possible underlying at- 
tractor. However, methods from nonlinear dynamics did 
not succeed in establishing a low-dimensional attractor for
X-ray lightcurves of Cygnus X-1 |^. It is also important 
to mention that time series analysis methods usually assume
that the underlying process represents a dynamical
system, in contrast to a shot noise model.

As an alternative to the commonly applied shot noise 
models, the linear state space model (LSSM) as a general- 
ization of dynamical linear autoregressive models includ- 
ing the observational noise has been proposed to model 
the X-ray variability of active galactic nuclei in Q . Two 
attractive properties of this approach are, firstly, that the 
LSSM can be fitted to the data in the time domain and, 
secondly, that it explicitly takes the observational noise 
covering the dynamics into account. The state space 
model has been applied to data from Cygnus X-1 in its
low state. This analysis has revealed that a first order
autoregressive process describes the dynamics of the
X-ray variability well. This predicts a shot noise model 
with an exponential decay and a very specific mode of 
excitation of these shots. 

In this contribution, we analyze X-ray lightcurves of 
Cygnus X-1 from its low and intermediate state by the 
LSSM as well as by a method which is able to capture 
deviations from linearity. In accordance with |^ , a scalar 
LSSM results in a fit that explains the linear correlations 
of the time series well. However, the nonlinear analysis,
using a measure for time reversibility of the process,
reveals strong deviations from linearity on exactly the
dynamical time scale found by the LSSM. To interpret
this result consistently, we discuss the mathematical and 
astrophysical implications of linear stochastic and shot 
noise models. 

Finally, we suggest a combination of tools from linear 
and nonlinear time series analysis methods as a proce- 
dure to test the predictions of astrophysical models on 
observed data. 

The organization of the paper is as follows: In Section II
we introduce the data under investigation. In
Section III we discuss shot noise and linear stochastic
models and their relation. Furthermore, we explain how
we use the method of surrogate data to test for time
reversibility. Section IV presents the results, which are
discussed in Section V.



II. THE DATA 

The data were recorded with the Proportional Counter 
Array (PCA) on board the Rossi X-ray Timing Explorer 
(RXTE). The X-ray activity of Cygnus X-1 is classi- 
fied as low, intermediate, and high, depending on the 
mean count rate [9]. Our analysis is based on two data
sets: The first data set was recorded on 22nd May 1996,
19:05:12 - 19:48:02, while Cygnus X-1 was in its intermediate
state. The energy range was 2.0 - 14.1 keV
(channel range: 0-35). The sampling frequency was 256 
Hz and the data set consists of 655,360 data points. The 
mean number of counts per bin was 38.3 with standard 
deviation 10.0. The second data set was recorded on 
12th February 1996, 9:37:20 - 10:03:06, while Cygnus X-1
was in its low state. The energy range was 2.0 - 9.9
keV (channel range: 0-35). The sampling frequency was
256 Hz and the data set consists of 394,752 data points.
The mean number of counts per bin was 18.7 with standard
deviation 7.1. Figure 1 displays a 3 s segment of
the first data set. A part of the variability of the data 
is explained by the fact that the recording process is a 
counting process. This leads to additive uncorrelated ob- 
servational noise which is Poisson distributed. Due to the 
high mean count rate this Poisson noise is well approxi- 
mated by Gaussian noise. 



III. METHODS 

A. Shot noise processes 

Shot noise processes are a specific type of point pro- 
cesses |l^. Point processes are characterized by a prob- 
abilistic law that some event happens at a certain time. 
For the simplest form of a shot noise model the probabilistic
law of occurrence of events follows a Poisson process
and the event is an exponential decay with initial
value $M$ and decay time $\tau$. A Poisson process is defined
by the property that the probability of an event to take
place in a time interval $(t, t + \Delta t)$ is proportional to $\Delta t$
in the limit of small intervals:

$$ \lim_{\Delta t \to 0} \mathrm{prob}\big(\text{Event in } (t, t + \Delta t)\big) = p\,\Delta t , \tag{1} $$



where $p$ denotes the intensity of the process. The sampled
time series consists of a superposition of the single shots
at times $T_i$ whose occurrence follows Eq. (1), i.e.,

$$ x(t) = M \sum_i \Theta(t - T_i)\, e^{-(t - T_i)/\tau} , \tag{2} $$

with $\Theta(z) = 1$ if $z > 0$, $\Theta(z) = 0$ if $z < 0$. We call this
process the classical shot noise process. 
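
As an illustration, Eqs. (1) and (2) can be sketched on a regular time grid in a few lines of Python. This sketch is not part of the original analysis: the function name `simulate_shot_noise`, the use of NumPy, and the simplification that every shot starts at the beginning of its sampling bin are assumptions made for the example.

```python
import numpy as np

def simulate_shot_noise(n_samples, p, tau, m=1.0, dt=1.0, seed=0):
    """Sample x(t) = M * sum_i Theta(t - T_i) * exp(-(t - T_i) / tau) on a grid.

    p is the intensity of the Poisson process, tau the decay time and m the
    fixed initial value of each shot (all names are illustrative).
    """
    rng = np.random.default_rng(seed)
    # Number of events per bin of width dt is Poisson distributed with mean p * dt.
    counts = rng.poisson(p * dt, size=n_samples)
    decay = np.exp(-dt / tau)
    x = np.empty(n_samples)
    level = 0.0
    for k in range(n_samples):
        # All earlier shots decay by one step; each new shot adds m.
        level = level * decay + m * counts[k]
        x[k] = level
    return x

x = simulate_shot_noise(n_samples=200_000, p=0.1, tau=15.0, seed=1)
# For dt much smaller than tau the mean approaches p * m * tau.
print(x.mean(), 0.1 * 1.0 * 15.0)
```

Because the superposition of exponentials with a common decay time can be updated recursively, the simulation needs only one pass over the grid.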

The power spectrum of this process, Eq. (2), is given by [0 :

$$ S(\omega) \propto \frac{1}{1/\tau^2 + \omega^2} , \qquad \omega \neq 0 . \tag{3} $$



The classical shot noise has already been proposed 
in Ref. ||^ to describe the observed variability of the 
lightcurves of Cygnus X-1. It consists of exponentially 
decaying shots with fixed initial value which occur in 
time with a constant rate of probability. Several gen- 
eralizations have been proposed: Shots with a decay rate 
drawn from a certain distribution have been suggested in 
P, P^Jl3| . A distribution for the initial values of the shots 
was considered in |jl^ . Vikhlinin et al. ||l^ introduced in- 
teractions between different shots. Furthermore, the sim- 
ple exponential form was replaced by more complicated 
time courses showing an initial increase from zero to a 
maximum value followed by a decay to zero |^]. These 
types of profiles are supported by Monte Carlo simulations
of astrophysical models of the X-ray processing by
spatially resolved Comptonization in a cloud of hot electrons
surrounding the accretion disk.

For some generalized shot noise models the power spectra
can be calculated analytically; otherwise they
have to be estimated from simulated data.



B. Linear stochastic dynamical systems 

In contrast to shot noise processes given by Eqs. (1) and (2),
continuous dynamical systems are given by a differential 
equation 



$$ \dot{x} = f(x, \epsilon) , \tag{4} $$

where $\epsilon$ denotes random perturbations which might influence
the time evolution of the dynamics. An attractive
feature of modeling time series by dynamical systems is
that the specific form of $f(x, \epsilon)$ might provide insight into
the physics at work, see |0,18[ for two examples from 
physics and jl^:^^ for application to physiological time 
series.

In the simplest case, if $f(\cdot)$ is linear in $x$ and the dynamical
noise $\epsilon$ is Gaussian distributed and additive, the
system represents linear combinations of damped oscillators
and relaxators that are driven by Gaussian noise.
Since the model is linear, all information about the model
is captured by the power spectrum. For a scalar dynamics:

$$ \dot{x} = -\frac{1}{\tau}\, x + \epsilon , \qquad \epsilon \sim WN(0, \sigma^2) , \tag{5} $$

the spectrum is given by:

$$ S(\omega) = \frac{\sigma^2}{1/\tau^2 + \omega^2} . \tag{6} $$

It is important to emphasize that first order linear
stochastic dynamical systems have the same $\omega$-dependence
of the spectrum as the classical shot noise
model, see Eq. (3).

Most often, $x$ cannot be observed directly, but only
a scalar function $g(x)$. Furthermore, the observation $y$
might contain additive measurement noise, denoted by
$\eta$:

$$ y = g(x) + \eta . \tag{7} $$

While the noise $\epsilon$ in Eq. (5) drives the dynamics, the
measurement noise $\eta$ in Eq. (7) only disturbs the observation
of the system. For the case of a linear dynamical
system, Eq. (5), with white additive observational noise
of variance $R$, the spectrum reads:

$$ S(\omega) = \frac{\sigma^2}{1/\tau^2 + \omega^2} + R . \tag{8} $$

Since measured data are sampled, discrete time dynamical
models

$$ x(t) = h(x(t - \Delta t), \epsilon(t)) \tag{9} $$

are often used. If both the dynamical and the measurement
noise are Gaussian distributed, and the functions $h$
and $g$ are linear, i.e.,

$$ \begin{aligned} x(t) &= A\, x(t - \Delta t) + \epsilon(t), & \epsilon(t) &\sim N(0, Q) \\ y(t) &= C\, x(t) + \eta(t), & \eta(t) &\sim N(0, R) \end{aligned} \tag{10} $$

the linear state space model (LSSM) as a generalization
of the well known autoregressive (AR) models results.
They represent discrete time versions of the continuous
time linear stochastic models. The matrix $A$ determines
the dynamics of the unobserved state vector $x(t)$. Its
dimension reflects the order of the process. The vector
$C$ maps the state vector to the observation. In the case
of a scalar dynamics, $A$ is related to the relaxation time
scale $\tau$ by $\tau = -1/\log|A|$. The mathematical formalism
of the LSSM and procedures to estimate its parameters
are described in detail in [19,21].

To test the consistency of a fitted model with the data,
at least three criteria should be applied.

1. The variance of the prediction residuals does not decrease
significantly for larger model dimensions.

2. The spectra calculated from the fitted LSSM for larger
model dimensions coincide.

3. An appropriate model should turn the correlations in
the data into prediction residuals consistent with white
noise. In the frequency domain this hypothesis can be
tested by comparing the periodogram of the residuals
with the expected straight line in the case of white noise
by the Kolmogorov-Smirnov test [22].



C. Noise reduction



Measured time series of natural systems often contain 
a large amount of additive observational noise. The fit- 
ted LSSM can be applied as a linear filter to perform a 
noise reduction on the data even if it is misspecified as a 
dynamical model of the underlying process. If the LSSM 
describes the second order properties of the process correctly,
the LSSM is the optimal linear filter [21].

Algorithmically, the noise reduction is achieved by first
applying the Kalman filter, which yields an estimate of
$x(t)$ based on the observed data $y(1), y(2), \ldots, y(t)$. Then
the so-called smoothing filter is applied backwards in
time to obtain estimates $\hat{x}(t)$ based on the whole data
set |2^. The possibility to apply this smoothing filter
relies on the property of linear stochastic processes to be
time reversible, see Section III E. Multiplication of $\hat{x}(t)$
by the estimated $C$ yields an estimate $\hat{y}(t)$ of the noise-free
scalar observable.
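
The two passes described above can be sketched for the scalar case of Eq. (10) as follows. This is a minimal illustration, not the implementation used for the analysis: the parameters `a`, `c`, `q`, `r` are assumed to be known (in practice they are estimated from the data), the data are assumed to have zero mean, the backward pass is written in the Rauch-Tung-Striebel form, and the function name `kalman_smooth` is hypothetical.

```python
import numpy as np

def kalman_smooth(y, a, c, q, r):
    """Noise reduction for x(t) = a x(t-1) + eps(t), y(t) = c x(t) + eta(t).

    Forward pass: Kalman filter. Backward pass: Rauch-Tung-Striebel smoother.
    Returns c * x_smooth, the estimate of the noise-free observable.
    """
    n = len(y)
    x_pred, p_pred = np.zeros(n), np.zeros(n)
    x_filt, p_filt = np.zeros(n), np.zeros(n)
    # Start from the stationary state distribution (requires |a| < 1).
    x_prev, p_prev = 0.0, q / (1.0 - a * a)
    for t in range(n):
        # Prediction step based on y(1), ..., y(t-1).
        x_pred[t] = a * x_prev
        p_pred[t] = a * a * p_prev + q
        # Update step using the new observation y(t).
        gain = p_pred[t] * c / (c * c * p_pred[t] + r)
        x_filt[t] = x_pred[t] + gain * (y[t] - c * x_pred[t])
        p_filt[t] = (1.0 - gain * c) * p_pred[t]
        x_prev, p_prev = x_filt[t], p_filt[t]
    # Backward pass: condition every state estimate on the whole data set.
    x_smooth = x_filt.copy()
    for t in range(n - 2, -1, -1):
        back_gain = p_filt[t] * a / p_pred[t + 1]
        x_smooth[t] = x_filt[t] + back_gain * (x_smooth[t + 1] - x_pred[t + 1])
    return c * x_smooth

# Synthetic check: AR(1) state with unit variance plus unit-variance noise.
rng = np.random.default_rng(0)
a = np.exp(-1.0 / 14.2)
q, r, n = 1.0 - a * a, 1.0, 5000
x = np.zeros(n)
x[0] = rng.standard_normal()
for t in range(1, n):
    x[t] = a * x[t - 1] + np.sqrt(q) * rng.standard_normal()
y = x + np.sqrt(r) * rng.standard_normal(n)
y_hat = kalman_smooth(y, a, 1.0, q, r)
mse_raw = np.mean((y - x) ** 2)
mse_smooth = np.mean((y_hat - x) ** 2)
print(mse_raw, mse_smooth)
```

In this synthetic setting the smoothed series should lie considerably closer to the hidden state than the raw observations do.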

The statistical properties of the estimated $\hat{y}(t)$ can be
understood in the frame of Bayesian estimation, see p^ ]
for a detailed discussion. The model with its fitted parameters
represents a prior on the smoothness of the hidden
$x(t)$. Conditioned on this prior, a maximum likelihood
estimate of $y(t)$ is obtained. The estimated time
series is the most probable one assuming the validity of
the model, Eq. (10).

It should be emphasized that the estimated time se- 
ries does not represent a typical realization of the fitted 
model used as prior. Even if the fitted model is the true 
one, the estimated time course is a slightly low-pass fil- 
tered version of a typical realization. If the fitted model 
is, however, not the true model, the estimated time series
will show statistical properties which, figuratively speaking,
lie between those of the process which generated the
data and the model used as prior. In particular, if the true
process is nonlinear showing a strong time irreversibility, 
this quantity might be reduced for the estimated time se- 
ries. Thus, the procedure does not lead to false positive 
results.

D. The relation between linear models and shot noise models

Linear autoregressive and shot noise processes are both 
stochastic processes. The randomness driving these pro- 
cesses usually reflects the restricted knowledge about the 
dynamics at work. Often, the dynamics is exposed to 
numerous influences that cannot be taken into account 
explicitly. Even if these influences are deterministic in 
nature they effectively act as random influences due to 
their large number. The characteristic difference between 
autoregressive and shot noise processes is the way the 
randomness enters the process: i) In dynamical processes 
it describes a random force that influences the dynamics 
in every instant of time, ii) In point processes it acts as 
a trigger that generates a certain event only at certain 
points in time. 

However, there is a formal connection between the clas- 
sical shot noise process and the scalar linear stochastic 
dynamical process. Formally, and "not in the spirit of
point processes", one can transform Eq. (2) into

$$ x(t) = (1 - \Delta t/\tau)\, x(t - \Delta t) + \epsilon(t) , \tag{11} $$

where $\epsilon(t)$ has the specific form:

$$ \epsilon(t) = \begin{cases} 0 & \text{with probability } 1 - p\,\Delta t \\ M & \text{with probability } p\,\Delta t \end{cases} \tag{12} $$

Thus, for $p\,\Delta t \approx 1$ and $M$ following a Gaussian distribution,
there is a formal equivalence between the scalar
linear autoregressive process and the classical shot noise
process which is characterized by its exponentially decaying
shot profile. In practice $\Delta t$ corresponds to the
sampling interval. The condition $p\,\Delta t \approx 1$ means that
the process is highly undersampled, since single shots are
not resolved. The required Gaussianity of the distribution
of the initial values of the shots does not meet the
physical constraint of positivity in the astrophysical context
of X-ray bursts. In the limit $p\,\Delta t \gg 1$ it might be
an effective description resulting from the superposition
of the unresolved Poisson process.

In summary, scalar linear dynamical processes are a 
certain formal limiting case of shot noise models. Only 
in the case of linearity, there is no interaction between 
the excitations and time course of the shots. It should 
be noted that, in general, nonlinear stochastic dynamical 
systems cannot be formulated as a formal limit of shot 
noise models. 
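
The recursion in Eq. (11) can be checked numerically for a single shot. The following sketch, with the hypothetical helper `single_shot_response`, iterates Eq. (11) with one event at the first step and compares the result with the exponential profile of Eq. (2); the agreement holds to first order in $\Delta t/\tau$, and the parameter values are chosen for illustration only.

```python
import math

def single_shot_response(m, tau, dt, n_steps):
    """Iterate Eq. (11) with one event of size m at step 0 and none afterwards."""
    x, out = 0.0, []
    for k in range(n_steps):
        eps = m if k == 0 else 0.0          # Eq. (12): event only at k = 0
        x = (1.0 - dt / tau) * x + eps      # Eq. (11)
        out.append(x)
    return out

tau, dt = 15.0, 0.01
response = single_shot_response(1.0, tau, dt, 3001)
exact = math.exp(-3000 * dt / tau)          # exponential shot profile after 3000 steps
print(response[3000], exact)
```

For sampling intervals that are not small compared to $\tau$, the factor $1 - \Delta t/\tau$ would have to be replaced by $e^{-\Delta t/\tau}$ to reproduce the exponential decay exactly.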



E. Beyond linear models: Time irreversibility 

An important property of linear Gaussian processes 
is time reversibility, i.e., the statistical properties of the 
process are the same forward and backward in time [24].
An intuitive explanation is that the statistical properties
of these processes are completely captured by the
autocorrelation function, which is by definition symmetric
under time reversal. Shot noise processes with non-symmetric
shot profiles are not time reversible, and neither are
many nonlinear dynamical systems. The Gaussianity of
the noise $\epsilon(t)$ of a linear autoregressive process is crucial
for the time reversibility. Any deviation from Gaussianity
leads to time irreversibility even in the case of linear dynamics
||2j] . This is of special interest in view of Eq. (12).
While time reversibility has been used to test for nonlinearity
in dynamical systems [25-28], we will use it here
as an indicator for a shot noise model. A test for time 
irreversibility in this context will be discussed in the next 
section. 



F. Nonlinear analysis: The method of surrogate data 

The theory of nonlinear dynamical systems offers
notions to characterize processes beyond linearity, see
[29,30] for a review. Various quantities have been introduced
to reveal whether an observed time series is a
realization of a chaotic system; among others, the correlation
dimension [31], Lyapunov exponents [32], and
nonlinear forecasting errors [33]. It was later observed
that, due to the finite size of data, noise, and linear
correlations, the algorithms to calculate these quantities
can give false positive results.

To test the reliability of the results, the method of sur- 
rogate data has been invented independently by different 
authors, e.g. p^-^8|, but has been made most popular 
by [35]. It has found wide applications in the analysis
of astrophysical [|6|,|9|-|4|] , geophysical |4|-|4|] and bio- 
physical 1^-0 data. 

The general idea is to simulate time series whose sta- 
tistical properties are constrained to the null hypothesis 
one wants to test for |4^. In testing for linearity this is 
achieved by randomizing the phases of the Fourier trans- 
form of the data and transforming the result back to the 
time domain. A possible static nonlinearity in the observation,
$g(x)$ in Eq. (7), is known to produce spurious
significant results [49]. Therefore, a proper adjustment of
the distribution of the time series data is performed. The
same algorithm that is applied to the original data is then
applied to many realizations of time series from this
procedure, leading to a distribution of the feature calculated
by the algorithm assuming linearity. A significant difference
between the distribution of the feature produced by the
algorithm for the surrogate data and the original data is 
taken as an indication that the process underlying the 
original is not a Gaussian, stationary, stochastic, linear 
one. A significant result of the test does not necessarily 
indicate chaoticity of the process, since this is only one 
possibility to violate the null hypothesis. 
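
A minimal sketch of such a surrogate generator is given below. It follows the widely used amplitude-adjusted Fourier transform (AAFT) scheme; whether the analysis reported here used exactly this variant of the distribution adjustment is an assumption, and the function names `phase_randomize` and `aaft_surrogate` are chosen for the example.

```python
import numpy as np

def phase_randomize(x, rng):
    """Surrogate with the periodogram of x but random Fourier phases."""
    n = len(x)
    spec = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.shape)
    new_spec = np.abs(spec) * np.exp(1j * phases)
    new_spec[0] = spec[0]                    # keep the mean unchanged
    if n % 2 == 0:
        new_spec[-1] = spec[-1]              # Nyquist component must stay real
    return np.fft.irfft(new_spec, n=n)

def aaft_surrogate(x, rng):
    """Phase-randomized surrogate with the amplitude distribution of x."""
    n = len(x)
    ranks = np.argsort(np.argsort(x))
    # Gaussian series that follows the rank order of the data.
    gauss = np.sort(rng.standard_normal(n))[ranks]
    randomized = phase_randomize(gauss, rng)
    # Map back: the surrogate takes exactly the values of x,
    # arranged in the rank order of the phase-randomized series.
    return np.sort(x)[np.argsort(np.argsort(randomized))]

rng = np.random.default_rng(0)
data = rng.standard_normal(1000) ** 2        # skewed toy data, illustration only
pr = phase_randomize(data, rng)
surrogate = aaft_surrogate(data, rng)
```

The plain phase randomization preserves the periodogram exactly, whereas the rescaling step of the AAFT scheme preserves the amplitude distribution exactly and the spectrum only approximately.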

Previous analyses revealed that it is unlikely that
Cygnus X-1 as well as other comparable X-ray sources
represent a low-dimensional chaotic system |]^, po| , ^ .
Therefore, we apply the surrogate data test to look for
deviations from the null hypothesis in general.

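The results of the surrogate data test for a feature $f$
are usually reported as significance $S$:

$$ S = \frac{\left| f_{\mathrm{data}} - \langle f \rangle_{\mathrm{surr}} \right|}{\sigma_{\mathrm{surr}}} , \tag{13} $$

where $f_{\mathrm{data}}$ denotes the value of the feature for the
original data, $\langle f \rangle_{\mathrm{surr}}$ the mean of the distribution of the
feature for the surrogates and $\sigma_{\mathrm{surr}}$ its standard deviation.
Assuming a Gaussian distribution for the feature,
a value of $S = 2.6$ corresponds to a significance level of
$\alpha = 0.01$.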
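
The correspondence between $S$ and the significance level can be verified with a short calculation. The sketch below assumes a two-sided test and a Gaussian distribution of the feature under the null hypothesis; the function names are illustrative.

```python
import math

def significance(f_data, f_surrogates):
    """Distance of the data feature from the surrogate mean in units of sigma."""
    n = len(f_surrogates)
    mean = sum(f_surrogates) / n
    var = sum((f - mean) ** 2 for f in f_surrogates) / (n - 1)
    return abs(f_data - mean) / math.sqrt(var)

def two_sided_level(s):
    """P(|Z| > s) for a standard normal variable Z."""
    return math.erfc(s / math.sqrt(2.0))

print(significance(5.0, [1.0, 2.0, 3.0]))   # surrogate mean 2, standard deviation 1
print(two_sided_level(2.6))                 # close to 0.01
```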

We propose here a surrogate data analysis based on 
time reversibility. Generalizing a suggestion of Weiss Q , 
a simple measure denoted by $Q(m)$ for a deviation from
reversibility for a certain time lag $m$ was introduced
in 12^1:

$$ Q(m) = \frac{\langle (x(t + m) - x(t))^3 \rangle}{\langle (x(t + m) - x(t))^2 \rangle} . \tag{14} $$



More complex measures for time irreversibility based on 
conditional, respectively joint probability distributions 
are described in |^-^. 
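
A time reversal asymmetry statistic of this kind, the ratio of the third to the second moment of the lagged differences, takes only a few lines to compute. The sketch below uses a deterministic sawtooth with a sudden rise and a slow decay as a toy example of an irreversible signal; the function name `q_statistic` and the test signal are assumptions for illustration.

```python
import numpy as np

def q_statistic(x, m):
    """<(x(t+m) - x(t))^3> / <(x(t+m) - x(t))^2> for a positive integer lag m."""
    d = x[m:] - x[:-m]
    return np.mean(d ** 3) / np.mean(d ** 2)

# Sawtooth: jumps up by 0.9 once per period, then decays in steps of 0.1.
sawtooth = 1.0 - (np.arange(1000) % 10) / 10.0
q_forward = q_statistic(sawtooth, 1)
q_backward = q_statistic(sawtooth[::-1], 1)
print(q_forward, q_backward)
```

Reversing the time series flips the sign of every lagged difference, so the statistic changes sign exactly; for a time reversible process it is therefore expected to vanish.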

Since it is not clear beforehand at which lag $m$ a possible
deviation from the null hypothesis might result in a
significant $Q(m)$ statistics, the significances $S(m)$ will be
evaluated for all lags up to a maximum lag. This leads
to the statistical problem of multiple testing. It is important
to emphasize that this has an impact on the level
of significance $\alpha$, i.e., the probability to reject the null
hypothesis although it is true. If the null hypothesis is
tested in $n$ independent tests at the level $\alpha$, the probability
to reject the null hypothesis at least once is given
by

$$ 1 - (1 - \alpha)^n . \tag{15} $$



For example, for $\alpha = 0.01$ and $n = 10$, the actual significance
level is approximately 0.1, leading to a ten times higher
probability for an incorrect rejection of the null hypothesis
than expected. A simple cure to this problem is the
Bonferroni-correction |^^. Therefore, Eq. (15) is set equal
to the desired overall level $\alpha$ and solved for the level
$\alpha'$ of the individual tests:

$$ \alpha' = 1 - (1 - \alpha)^{1/n} . \tag{16} $$

Since $\alpha \ll 1$, the right hand side of Eq. (16) can be
approximated in first order, resulting in the simple rule:

$$ \alpha' \approx \alpha / n . \tag{17} $$



This procedure is known to be extremely conservative,
i.e., while it guarantees that the significance level is correct,
the test loses diagnostic power to detect a violation
of the null hypothesis. For some test statistics,
procedures are known to obtain tests that have the correct
significance level as well as a good diagnostic power,
see e.g. p2|-Q. It is not known to the authors how to
apply an analogous strategy to the $Q(m)$ statistics. The
main problem is that the correlations in the time series
produced by the underlying dynamics of the process lead
to correlations between the $Q(m)$ statistics for different
lags. Thus, the only cure known to the authors is to
check whether the results of an analysis of one time series
can be reproduced by the analysis of independent
measurements. Therefore, we subdivide our time series
into segments of length 20,000 data points each and calculate
the averaged $Q(m)$ statistics and its confidence
interval.
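
The numbers entering the multiple testing argument can be reproduced directly. The following sketch evaluates the probability of at least one false rejection in $n$ independent tests, the exact per-test level that restores a given overall level, and the first order Bonferroni rule; the function names are chosen for the example.

```python
def familywise_level(alpha, n):
    """Probability of at least one false rejection in n independent tests."""
    return 1.0 - (1.0 - alpha) ** n

def per_test_level(alpha_total, n):
    """Per-test level that yields the overall level alpha_total."""
    return 1.0 - (1.0 - alpha_total) ** (1.0 / n)

overall = familywise_level(0.01, 10)     # roughly 0.1 instead of 0.01
exact = per_test_level(0.01, 10)         # slightly above the Bonferroni value
bonferroni = 0.01 / 10
print(overall, exact, bonferroni)
```

Both formulas assume independent tests, which does not hold for the $Q(m)$ statistics at neighboring lags; this is why the consistency across independent segments is used instead.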

To reveal the expected behavior of the $Q(m)$ statistics
for shot noise processes, we simulate an exponential shot
noise process with intensity $p = 0.1$, $\tau = 15$, initial values
$M_i$ drawn from a uniform distribution in the interval
[0,1], and apply the $Q(m)$ statistics. Figure 2a shows a
segment of the simulated data. Figures 2b and 2c display
the $Q(m)$ statistics and the significances $S(m)$ for different
lags $m$ based on a realization of the process of length
20,000 data points. The monotonically decaying behavior
of the $S(m)$ curve does not depend on the intensity,
the relaxation time or the distribution of the shot noise
process. Of course, the quantitative behavior does. Classical
shot noise and first order linear stochastic dynamical
systems cannot be discriminated by linear methods
since their spectra coincide. The simulation shows that
higher order statistical properties allow for a discrimination.
Next we apply this concept to the analysis of
measured data.



IV. RESULTS 

We discuss the results for the time series of the inter- 
mediate state in detail. For the linear analysis, the results 
for the intermediate and low state data are comparable. 
Differences for the nonlinear analysis will be presented in
more detail in Section IV B. 



A. Linear analysis by state space models 

We fit linear state space models (LSSM), Eq. (10),
of increasing dimension to segments of the intermediate
state time series of length 20,000. In accordance with 
the results of |^ for the low state, the residual variance 
is constant for all models of dimension larger than zero. 
Furthermore, the analysis reveals an equal contribution 
of signal and noise to the total variance of the time series. 

Figure 3 displays the periodogram of the first segment
and the spectra calculated from fitted one- to three-dimensional
models on a log-linear and on a log-log scale.
The spectrum calculated from the fitted parameters
explains the over-all periodogram of the data well. Furthermore,
there is no significant difference between the spectra
of the fitted processes of different dimension. The relaxation
time of the scalar model is 14.2 sampling units, corresponding
to 55 ms. The Kolmogorov-Smirnov test does
not reject the hypothesis of white noise residuals at the
1% level of significance.
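
A frequency domain whiteness test of this kind can be sketched as a cumulative periodogram test: for white noise the normalized cumulative periodogram follows a straight line, and the maximum deviation from that line is compared with a Kolmogorov-Smirnov critical value. The sketch below is an illustration under these assumptions, not the code used for the analysis; the function name is hypothetical and the constant 1.63 is the asymptotic 1% critical value of the Kolmogorov distribution.

```python
import numpy as np

def cumulative_periodogram_ks(residuals):
    """Return (KS distance, approximate 1% critical value) for a whiteness test."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    power = np.abs(np.fft.rfft(r)) ** 2
    power = power[1:]                        # drop the zero frequency
    if len(r) % 2 == 0:
        power = power[:-1]                   # drop the Nyquist frequency
    q = len(power)
    cumulative = np.cumsum(power) / power.sum()
    straight_line = np.arange(1, q + 1) / q  # expectation for white noise
    distance = np.max(np.abs(cumulative - straight_line))
    return distance, 1.63 / np.sqrt(q)

rng = np.random.default_rng(0)
white = rng.standard_normal(20_000)
correlated = np.empty_like(white)
correlated[0] = white[0]
for t in range(1, len(white)):
    correlated[t] = 0.93 * correlated[t - 1] + white[t]

d_white, crit = cumulative_periodogram_ks(white)
d_corr, _ = cumulative_periodogram_ks(correlated)
print(d_white, d_corr, crit)
```

Residuals that still contain correlations concentrate their power at low frequencies, which shows up as a large deviation of the cumulative periodogram from the straight line.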

With respect to the dimension of the model, a fit of 
LSSMs of dimension one to three to the remaining 31 seg- 
ments confirms the result for the first segment. For the 
pieces of 20,000 data points as well as for the whole data 
set the spectra calculated from the estimated parameters 
do not differ from the spectra of the scalar model. The 
estimated relaxation times range from 12.4 to 17.4 sam- 
pling units, corresponding to 48 to 68 ms. For the data 
set from low state the qualitative results of the linear 
analysis are the same as for the intermediate state, but 
the relaxation times range from 40 to 56 sampling units, 
corresponding to 150 to 220 ms, in accordance with the 
results reported in Ref. 1^. 
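
The conversions between sampling units and physical time quoted above follow from the sampling frequency of 256 Hz, and the relation $\tau = -1/\log|A|$ links the relaxation time to the coefficient of the scalar model. The short sketch below reproduces these numbers; the helper names are chosen for the example.

```python
import math

SAMPLING_RATE_HZ = 256.0
SEGMENT_LENGTH = 20_000

def relaxation_time_ms(tau_samples):
    """Convert a relaxation time from sampling units to milliseconds."""
    return 1000.0 * tau_samples / SAMPLING_RATE_HZ

def ar_coefficient(tau_samples):
    """Scalar coefficient A with tau = -1 / log|A| (tau in sampling units)."""
    return math.exp(-1.0 / tau_samples)

for tau in (12.4, 14.2, 17.4):
    print(tau, round(relaxation_time_ms(tau)), ar_coefficient(tau))

# Number of complete segments in the intermediate and low state data sets.
print(655_360 // SEGMENT_LENGTH, 394_752 // SEGMENT_LENGTH)
```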

Linear analysis methods, like spectral analysis, only 
capture the second order statistical properties of a pro- 
cess. For linear processes the higher order properties are 
a function of the second order correlations. This does not 
hold for nonlinear processes. Therefore, it is possible
that some nonlinear dynamics is at work in the
process under investigation which is invisible to a linear
analysis. If such nonlinear dynamics can be described
by Eq. (4), it can be concluded that its dimension is
not larger than one. Any higher dimensional continuous-time
system would have led to a difference between the
spectra of the one and the higher dimensional LSSMs,
since it would produce linear correlations for an order of
at least the dimension of the process. By the same line
of argument, a nonlinear first order dynamical process
should have affected the higher order spectra. Thus, the
linear analysis strongly suggests a linear stochastic first
order process for a description of the data in the frame
of dynamical systems.



B. Nonlinear analysis

First, we apply the surrogate data based search for
deviations from linearity as described in Section III F to
segments of length 20,000 up to a maximum lag of 1000
sampling units, corresponding to 3.9 s of the observation.
We use 100 surrogate data sets to estimate the mean
and the variance of the $Q(m)$ statistics, Eq. (14), for the
null hypothesis of linearity to calculate the significances
$S(m)$, Eq. (13).

For the first segment, at about lag 800 the significance
$S(m)$ of the $Q(m)$ statistics for time reversibility results
in a value larger than 4 (Figure 4a). This corresponds to
a probability for the null hypothesis smaller than $10^{-4}$.

As discussed in Section III F, the results of the nonlinear
analysis by the surrogate data method using the
$Q(m)$ statistics have to be based on the consistency of the
results for independent measurements due to the multiple
testing problem. Figures 4b-d display the results for the
following 20,000 data point segments of the time series.
There is no consistent deviation from the null hypothesis
for any lag.

Linear analysis reveals that the signal to noise ratio
is equal to one if measured in relative amplitudes. This
large amount of observational noise diminishes the diagnostic
power of the surrogate data test to detect a possible
time irreversibility. As discussed in Section III C, the
LSSM can be applied to estimate the noise-free dynamical
time series within a Bayesian framework. Figure 5 displays
the results for the Kalman (and smoothing)-filtered
data based on the one-dimensional LSSM analogous to
Fig. 4. For large lags no significant changes appear apart
from a smoother behavior of the curve, which results from
the low-pass filter property of the estimation procedure as
discussed in Section III C. But for small lags the behavior
of the curves changes: Figure 6 shows the significances
$S(m)$ of the $Q(m)$ statistics for the first 100 lags. Consistently,
a significant deviation from linearity is found
for exactly those lags up to the time scale of approximately
15 sampling units that was found as typical time
scale by the linear analysis. Note that the resulting $S(m)$
curves for the Kalman-filtered data resemble the decaying
curve expected for a shot noise model, Fig. 2, while
the raw data suggest a maximum at around 10 sampling
units. The similarity of the results for larger time scales
and the differences for short time scales can be interpreted
in the frame of shot noise models. For lags much
larger than the relaxation time of the shots, the data
are independent and the $Q(m)$ statistics is expected to
vanish. The appearance of the $S(m)$ is determined by
correlated fluctuations, as discussed in Section III F. For
time scales smaller than the relaxation time of the shots,
the $Q(m)$ statistics is significantly different from zero,
see Fig. 2. The difference between the results for the raw
and the Kalman-filtered data is an effect of the lag dependent
signal to noise ratio. This is most pronounced
for the shortest lags, since the time-course of each shot is
continuous, but the observational noise is discontinuous,
leading to a decreasing signal to noise ratio for smaller
lags. This is the reason why $S(m)$ tends to zero for lags
close to zero for the raw data.

Since the Kalman filter is linear, it is not expected
to lead to artificial results. This has been confirmed in
a simulation study. We used the fitted one-dimensional
LSSM to generate data and calculated the significance
$S(m)$ of the $Q(m)$ statistics for these data and data obtained
by the Kalman filter. The results are displayed in
Fig. 7 and show that the Kalman filter does not produce
spurious results for processes that are time reversible.
Simulation studies using shot noise processes with added
observational noise show that the Kalman-filtered data
reproduce the behavior of the $S(m)$ curve for shot noise
processes as displayed in Fig. 2. Thus, the significant
results are not due to the Bayesian estimation by the
Kalman filter (Section III C). This is reasonable since in
the worst case this linear filtering "pulls" the data in the
direction of behaving more linearly. That means that an
existing time irreversibility would be decreased, but no
spurious significant effects are introduced.

Figure |g shows the mean and 2(t confidence region 
of the significance S{m) of the Q{m) statistics obtained 
from the 32 segments of length 20,000 based on the raw 
and the noise-reduced time series from the intermediate 
state. Figure ^ displays the corresponding plot for the 
19 segments from the low state time series. For both 
data sets, the S(m) curves for the raw and the Kalman-filtered
time series are statistically indistinguishable for
larger lags. Significant differences arise only for small 
lags. Based on the analysis of the raw data, any kind 
of shot noise model would be rejected. For the analysis 
based on the Kalman-filtered data, the S(m) curve for
the low state time series suggests a classical shot noise 
model by its decay for small lags and insignificant val- 
ues for larger lags, compare Fig. 0. For the intermediate 
state time series, a significant maximum occurs at a lag 
TO of 30 sampling units, corresponding to 117 ms. This 
maximum cannot be reproduced by a simple shot noise 
model and calls for more complex processes discussed in
Section IlII A. 



For both time series, our analysis shows that the linear
state space model is not an appropriate model to
describe the data, since the significant time irreversibilities
calculated based on the fitted models contradict the
assumption of these models. It is, however, important to
note that the LSSM can be used to perform an efficient
noise reduction.



V. DISCUSSION 

We have developed methods and discussed how one can
decide, based on measured data, whether a time series,
even one that comprises a large amount of additive
observational noise, has been produced by a scalar linear
stochastic dynamical system or by a shot noise process. We
have shown that linear spectral analysis does not allow
for a discrimination. The nonlinear property of time
irreversibility of shot noise processes forms the basis for a
significant distinction. A straightforward evaluation of this
feature is hampered by the statistical problem of multiple
testing and by the effects of additive observational noise.
We have discussed how these problems can be overcome.
We have applied methods from linear and nonlinear 
time series analysis to two X-ray variability lightcurves 
of the black hole candidate Cygnus X-1. The first time 
series was recorded while Cygnus X-1 was in an intermediate
state; the second represents the low state.
Such data are usually described by shot noise models, 
a specific kind of point processes. Although point pro- 
cesses are fundamentally different from dynamical sys- 
tems, they share some properties with the latter. First, 
the spectrum of the classical shot noise process coincides 
with that of a scalar continuous time linear Gaussian 
stochastic process. Second, most shot noise models share
the property of most nonlinear dynamical systems of
being time irreversible.

Firstly, we have fitted linear state space models 
(LSSM) of increasing dimension to segments of the data. 
The variance of the prediction residuals does not decrease
for models of dimension larger than one, and the spectra
calculated from the fitted parameters of the different
models coincide, suggesting a scalar dynamical model. 
Testing the consistency of the prediction residuals with 
white noise has revealed a good over-all fit. The linear 
analysis shows that if the process is a dynamical system, 
it is linear and one-dimensional. Any higher-dimensional
or continuous-time nonlinear dynamical system would
have led to differences between one-dimensional and
higher-dimensional LSSMs with respect to the spectra calculated from
the fitted parameters and the variance of the prediction 
residuals. Furthermore, the analysis suggests a signal to 
noise ratio of one. 
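
The order selection argument can be sketched on simulated data. As a
simplifying assumption, the sketch fits plain autoregressive models by
least squares to noise-free data instead of fitting LSSMs with
observational noise, which would require a maximum likelihood
procedure. The function names and coefficients are hypothetical. For a
first order process the residual variance stays flat when the order is
increased, whereas for a second order process it drops from order one
to order two.

```python
import numpy as np

def simulate_ar(coeffs, n, rng, burn=500):
    """Simulate x[t] = sum_i coeffs[i] * x[t-1-i] + e[t] with unit-variance Gaussian e."""
    p = len(coeffs)
    e = rng.normal(size=n + burn)
    x = np.zeros(n + burn)
    for t in range(p, n + burn):
        x[t] = sum(c * x[t - 1 - i] for i, c in enumerate(coeffs)) + e[t]
    return x[burn:]  # discard the transient

def residual_variance(x, p, p_max):
    """Least-squares AR(p) fit; all orders are scored on the same samples x[p_max:]."""
    target = x[p_max:]
    lagged = np.column_stack([x[p_max - k:len(x) - k] for k in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return np.mean((target - lagged @ coef) ** 2)

rng = np.random.default_rng(2)
orders = (1, 2, 3)
x_first = simulate_ar([0.9], 20000, rng)           # first order process
x_second = simulate_ar([1.5, -0.75], 20000, rng)   # second order process
var_first = [residual_variance(x_first, p, max(orders)) for p in orders]
var_second = [residual_variance(x_second, p, max(orders)) for p in orders]
print("first order data :", np.round(var_first, 3))
print("second order data:", np.round(var_second, 3))
```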

Fitting a LSSM to data in the time domain is asymptotically
equivalent to fitting its spectrum to the periodogram
of the data in the frequency domain [55]. The
spectrum of the classical shot noise process is identi- 
cal with the spectrum of a first order linear dynamical 
process. Thus, even if a goodness-of-fit test in the fre- 
quency domain does not reject a LSSM, no discriminat- 
ing conclusions can be drawn with respect to the question 
whether a dynamical system or a shot noise process has 
generated the data. Therefore, astrophysical interpreta- 
tions of the parameters of fitted LSSMs should 
be treated with care. 
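
The coincidence of the two spectra can be made explicit in discrete
time. The following is a sketch under simplifying assumptions that are
not taken from the text: shots with decay time \tau arrive on the
sampling grid, the number n(t) of shots starting at time step t is
independent Poisson distributed with mean \lambda, and constant
normalization factors of the spectrum are omitted.

```latex
% Exponential shot noise written as a recursion (assumed discrete-time setting)
x(t) = \sum_{j \ge 0} a^{j}\, n(t-j), \qquad a = e^{-1/\tau}
\quad\Longrightarrow\quad x(t) = a\, x(t-1) + n(t).

% n(t) is uncorrelated with variance \lambda, so the spectrum of x(t) - E[x(t)] is
S_x(\omega) = \frac{\lambda}{\left| 1 - a\, e^{-i\omega} \right|^{2}}
            = \frac{\lambda}{1 + a^{2} - 2a\cos\omega},

% which equals the spectrum of the Gaussian AR(1) process
% x(t) = a\, x(t-1) + \epsilon(t) with \mathrm{var}(\epsilon) = \sigma^{2} = \lambda.
```

In this setting the two processes differ only in their mean and in
their higher order moments, which is why a method based on second
order properties cannot separate them.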

Astrophysical studies indicate that the processes un- 
der investigation follow some kind of shot noise model 
|p|-p|,^| jl^ , p^ , |5^ -|60| . In general, shot noise models are not 
reversible in time. Surrogate data testing for time irre- 
versibility for different lags introduces the multiple test- 
ing problem. Therefore, we have investigated whether 
consistent results could be obtained from an analysis of 
segments of the time series. 
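
The size of the multiple testing problem, and the way an analysis of
segments addresses it, can be illustrated numerically. The sketch
assumes that under the null hypothesis the significances are
independent standard normal variables across lags and segments. This is
an idealization, since neighboring lags are correlated in practice. The
numbers of lags and segments are illustrative.

```python
import numpy as np

alpha, n_lags, n_segments = 0.05, 100, 32

# Probability of at least one false rejection among n_lags independent tests.
family_wise_error = 1.0 - (1.0 - alpha) ** n_lags

# Hypothetical significances S(m) under the null hypothesis (assumed N(0, 1)).
rng = np.random.default_rng(3)
s_null = rng.normal(size=(n_segments, n_lags))

# Fraction of segments that show a "significant" lag somewhere by chance.
any_exceed = np.mean(np.any(np.abs(s_null) > 1.96, axis=1))

# Consistency across segments: mean curve and twice its standard error.
mean_s = s_null.mean(axis=0)
sem = s_null.std(axis=0, ddof=1) / np.sqrt(n_segments)
band_excludes_zero = np.abs(mean_s) > 2.0 * sem

print(f"family-wise error for {n_lags} lags: {family_wise_error:.3f}")
print(f"segments with a chance exceedance: {any_exceed:.2f}")
print(f"lags where the band excludes zero: {band_excludes_zero.mean():.2f}")
```

A single segment almost always shows some lag beyond the pointwise
threshold, whereas a deviation that persists in the mean over segments
at the same lags is much harder to obtain by chance.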

For the raw data of the low state time series, no 
significant deviation from linearity has been detected. 
However, we have found a double well behavior of the 
Q(m) statistics in the case of the intermediate state data
(Fig. ^). Both results contradict a simple shot noise 
model. This might have been caused by the low sig- 
nal to noise ratio. In the frame of Bayesian estimation 
based on a fitted LSSM, we have applied the Kalman 
filter to get a noise-reduced time series. Based on these 
noise-reduced data, we have found a significant deviation
from linearity at the time scale found by the linear analysis,
in accordance with results for simulated data
from a simple shot noise model. While the results for
the low state time series are in agreement with a sim- 
ple shot noise model with independently decaying shots, 
the intermediate state time series shows a more complex 
behavior. Apart from the decay for small lags the sig- 
nificances show an additional distinct maximum. Our 
results are based on the estimated noise-reduced time 
series obtained by the LSSM. Any noise reduction procedure
imposes assumptions about the underlying process
and might lead to artifacts if the assumptions are not met,
as in the present study. In the case considered here, a violation
of the assumptions of the model, in the worst case,
leads to less significant results since the filter is linear. 
Thus, the procedure is statistically conservative even if 
the model is misspecified. 

By its qualitative difference to the results for simple 
shot noise models for the intermediate state time series, 
the Q(m) statistics as a measure for time irreversibility
poses a constraint on astrophysical models for this phenomenon.
It has been shown that the classical shot noise
model (0-^ does not satisfactorily describe the process
under consideration. Therefore, one has to search for
more complex models. For such models, the significance
of the Q(m) statistics (Fig. ^) provides an additional and
independent test beyond the usually applied energy and 
power spectra. For example, our results exclude shot 
noise models with symmetrical rise and decay of the shots 
as discussed in [Q, since such models would not lead to 
a violation of time reversibility. In general, one has to 
Kalman-filter the data generated by the proposed model 
in the same way as the observed data and test the compatibility
of the resulting S(m) curve statistically.

No explicit test to decide whether a dynamical system 
or a shot noise process underlies a measured time series is 
known to the authors. Summarizing the results from the 
linear and the nonlinear time series methods, the analysis 
strongly suggests that a shot noise model is at work. This 
is in accordance with astrophysical considerations: X- 
rays undergo multiple Compton scattering in the corona 
of hot electrons surrounding Cygnus X-1. The shots represent
the projection of this spatio-temporal, reaction-diffusion-like
process onto the time axis. Because of the loss of spatial
resolution, the resulting process can no longer
be formulated as a dynamical system. This reveals
an interesting aspect of surrogate data testing that
might also apply for other applications Q]. Initially, 
testing by surrogates was introduced to support the de- 
tection of chaotic dynamics. Later, it was recognized that 
a rejection of the null hypothesis of linear, stochastic, sta- 
tionary, Gaussian dynamics does not necessarily indicate 
chaos, i.e., a special type of nonlinear, stationary, de- 
terministic dynamics, since there are other possibilities 
to violate the assumptions of the above null hypothesis 
[61-64]. Furthermore, surrogate data testing was characterized
as not too informative if simple inspection of the
data reveals a deviation from the null hypothesis |^ . In 
the present case, the linear analysis looks promising at 
first sight, rendering the surrogate data test informative.
But here, the reason for a significant surrogate data test
is not chaotic nonlinearity, but the projection from the
spatio-temporal into the temporal domain. Thus, the X- 
ray variability data offer a new possibility for a rejection 
of the null hypothesis of a linear dynamical system: The 
system is not a dynamical system of the form ẋ = f(x, ε)
at all. 

In summary, following a quotation of G.E.P. Box, "All
models are wrong, but some are useful", we propose the
use of the misspecified linear state space model together 
with the measure of time reversibility inspired by nonlin- 
ear dynamics as an additional test to the usually applied 
energy and power spectra to evaluate the validity of as- 
trophysical shot noise models on measured data. 

FIGURE CAPTIONS

Fig. |l| A 3 s segment of the intermediate state time series. 

Fig. ^ Analysis of a simulated shot noise process. (a) Segment
of a realization of an exponential shot noise
process with intensity p = 0.1 and decay time
r = 15 sampling units. (b) The Q(m) statistics,
Eq. (|l|). (c) Significances S(m), Eq. (y).

Fig. ^ Periodogram of the data (dots) and spectra (solid
lines) calculated from the estimated parameters of
the state space model of dimension one to three
in log-linear scale (top) and log-log scale (bottom).
Note that the spectra are virtually indistinguishable.

Fig. ^ Significances S(m) of the Q(m) statistics for lags
up to 1000. (a) First segment of the intermediate
state data set. (b-d) Results for the second to the
fourth segment.

Fig. ^ Significances S(m) analogous to Fig. ^. Dashed
lines: Results for the raw data. Solid lines: Results
for data after Kalman-filtering.

Fig. ^ Significances S(m) of the Q(m) statistics for lags
up to 100. Dashed lines: Results for the raw data.
Solid lines: Results for data after Kalman-filtering.

Fig. Results from a simulation study using the fitted
LSSM. The significances S(m) of the Q(m) statistics
are calculated for the raw data (solid line) and
the data after Kalman-filtering (dashed line).

Fig. ^ Significances S(m) of the Q(m) statistics and 2σ
confidence regions calculated from the 32 segments
of length 20,000 of the intermediate state time series.
Dashed line: Raw data. Solid line: Kalman-filtered
data.

Fig. ^ Significances S(m) of the Q(m) statistics and twice
the standard error calculated from the 19 segments
of length 20,000 of the low state time series. Dashed
line: Raw data. Solid line: Kalman-filtered data.